A spectro-temporal modulation index (STMI) for assessment of speech intelligibility
Abstract
We present a biologically motivated method for assessing the intelligibility of speech recorded or transmitted under various types of distortions. The method employs an auditory model to analyze the effects of noise, reverberations, and other distortions on the joint spectro-temporal modulations present in speech, and on the ability of a channel to transmit these modulations. The effects are summarized by a spectro-temporal modulation index (STMI). The index is validated by comparing its predictions to those of the classical STI and to error rates reported by human subjects listening to speech contaminated with combined noise and reverberation. We further demonstrate that the STMI can handle difficult and nonlinear distortions such as phase-jitter and shifts, to which the STI is not sensitive. © 2002 Published by Elsevier B.V.
Related papers
The Role of Temporal Fine Structure Cues in Speech Perception
In this thesis, the importance of temporal fine structure (TFS) in speech perception is investigated. It is well accepted that TFS is important for sound localization and pitch perception, while envelope (ENV) is primarily responsible for speech perception. Recently, a significant contribution of TFS in speech perception has been suggested. This was linked to the improved ability of normal-hear...
Automatic intelligibility assessment of pathologic speech in head and neck cancer based on auditory-inspired spectro-temporal modulations
Oral, head and neck cancer represents 3% of all cancers in the United States and is the 6th most common cancer worldwide. Depending on the tumor size, location and staging, patients are treated by radical surgery, radiology, chemotherapy or a combination of those treatments. As a result, their anatomical structures for speech are impaired and this leads to some negative impact on their speech i...
Modeling spectro-temporal modulation perception in normal-hearing listeners
The ability of human listeners to detect and discriminate spectro-temporal ripples in sound has been shown to be correlated with speech intelligibility performance in several conditions. Thus, if a model would be able to account for the spectro-temporal processing limits in the auditory system, such a framework could be used to analyze the auditory processes contributing to and limiting speech ...
Speech-nonspeech discrimination using the information bottleneck method and spectro-temporal modulation index
In this work, we adopt an information theoretic approach (the Information Bottleneck method) to extract the relevant spectrotemporal modulations for the task of speech / non-speech discrimination (non-speech events include music, noise and animal vocalizations). A compact representation (a “cluster prototype”) is built for each class consisting of the maximally informative features with respect to ...
Predicting speech intelligibility in conditions with nonlinearly processed noisy speech
The speech-based envelope power spectrum model (sEPSM; [1]) was proposed in order to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII). The sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv), which was demonstrated to successfully predict speech intelligibility in conditions with nonlinearly processed noisy speec...
Journal: Speech Communication
Volume 41
Year of publication: 2003